Statistical and machine learning approaches to predict the necessity for computed tomography in children with mild traumatic brain injury

Background Minor head trauma in children is a common reason for emergency department visits, but the risk of traumatic brain injury (TBI) in those children is very low. Therefore, physicians should consider the indication for computed tomography (CT) to avoid unnecessary radiation exposure to children. The purpose of this study was to statistically assess the differences between control and mild TBI (mTBI). In addition, we also investigate the feasibility of machine learning (ML) to predict the necessity of CT scans in children with mTBI. Methods and findings The study enrolled 1100 children under the age of 2 years to assess pre-verbal children. Other inclusion and exclusion criteria were per the PECARN study. Data such as demographics, injury details, medical history, and neurological assessment were used for statistical evaluation and creation of the ML algorithm. The number of children with clinically important TBI (ciTBI), mTBI on CT, and controls was 28, 30, and 1042, respectively. Statistical significance between the control group and clinically significant TBI requiring hospitalization (csTBI: ciTBI+mTBI on CT) was demonstrated for all nonparametric predictors except severity of the injury mechanism. The comparison between the three groups also showed significance for all predictors (p<0.05). This study showed that supervised ML for predicting the need for CT scan can be generated with 95% accuracy. It also revealed the significance of each predictor in the decision tree, especially the "days of life." Conclusions These results confirm the role and importance of each of the predictors mentioned in the PECARN study and show that ML could discriminate between children with csTBI and the control group.

These observations raise concerns that many of the CTs performed for this indication unnecessarily expose children to radiation, which is harmful in the long term and increases the risk of secondary malignancies [20][21][22]. In particular, in many children, history, physical examination, and a period of observation are sufficient to rule out significant intracranial injury [23][24][25]. Physicians in the emergency department must therefore decide whether or not to perform CT for each child with head trauma. Clinical decision rules such as PECARN provide an excellent algorithm for identifying children with clinically-important traumatic brain injury (ciTBI) and have prevented many unnecessary head CT scans in children [26][27][28].
Artificial intelligence (AI) uses computer systems to simulate cognitive abilities to achieve goals. Machine learning (ML) classification is one of the domains of AI that enables an algorithm or classifier to learn patterns in large, complex datasets and produce useful predictive outputs. The number of published ML studies in neurosurgery is increasing [29][30][31][32][33]. Some of them have focused on the application of ML algorithms to support clinical decision-making in neurosurgery [30]. However, no studies have yet been published as to the use of ML to predict the necessity of CT in children with mTBI.
The purpose of this study was to clarify two issues regarding mTBI and the requirement for a CT scan. First, we statistically assessed the differences in the PECARN predictors between control children and children with mTBI. Second, we evaluated the feasibility of ML to predict the necessity of CT scans in children with mTBI.
intubation for more than 24 h, or hospital admission of 2 nights or more. The definition of mTBI on CT included intracranial hemorrhage or contusion, cerebral edema, traumatic infarction, diffuse axonal injury, shearing injury, sigmoid sinus thrombosis, midline shift of intracranial contents or signs of brain herniation, diastasis of the skull, pneumocephalus, and skull fracture depressed at least the width of the table of the skull. We defined clinically-significant TBI (csTBI) as ciTBI plus mTBI on CT, because both require at least hospital admission for observation or further treatment.
CT scans were obtained at the clinician's discretion with helical CT scanners, with radiographic slices separated by 5mm or less. Before the application of the PECARN criteria, criteria for performing CT scans in our hospital were based on physician judgment and caregiver preference, although children with impaired consciousness, a history of LOC, and a history of seizures were of course considered. CT scans were interpreted by site board-certified neurosurgeons.

Selection of predictors
Risk predictors were defined based on those of the PECARN study [26], including gender, the severity of injury mechanism, history of loss of consciousness (LOC), LOC duration, history of vomiting, number of vomiting episodes, acting abnormally per caregivers, Glasgow Coma Scale (GCS), altered mental status, signs of basilar skull fracture, palpable fracture, and scalp hematoma. Age was recorded in days (days of life) in this study. Injury mechanisms were divided a priori into three categories [26]: severe, moderate, and mild. All predictors except gender and days of life were categorized as shown in Table 1.

Data analysis
For the two-group comparison between control and csTBI, an unpaired t-test and the Mann-Whitney U test were used to determine significance for parametric and non-parametric data, respectively. We also performed a three-group comparison among control, mTBI on CT, and ciTBI. For parametric data, unpaired (between-groups) one-factor analysis of variance with multiple comparisons was applied; for non-parametric data, multiple comparisons by Ryan's method using the Mann-Whitney U test were applied. All hypothesis tests were conducted against a 2-sided alternative. P values less than .05 were considered statistically significant.
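The two-group non-parametric comparison described above can be illustrated with a minimal sketch of a two-sided Mann-Whitney U test. This is not the authors' code: the function name `mann_whitney_u` and the example values are hypothetical, the p-value uses the normal approximation without tie correction or continuity correction, and Ryan's multiple-comparison adjustment is not reproduced here.

```python
from math import sqrt
from statistics import NormalDist


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, no tie correction).

    Returns (U, p), where U is the smaller of the two U statistics.
    """
    # Pool both samples, remembering which group each value came from.
    pooled = sorted((value, group) for group, sample in enumerate((x, y))
                    for value in sample)

    # Assign midranks so that tied values share the average of their ranks.
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # mean of 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = midrank
        i = j

    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    u = min(u1, u2)

    # Normal approximation to the null distribution of U.
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u  # z <= 0 because u is the smaller statistic
    p = min(1.0, 2 * NormalDist().cdf(z))
    return u, p


# Hypothetical ordinal codes for one predictor in two groups (not study data).
control = [1, 2, 3]
cs_tbi = [4, 5, 6]
u_stat, p_value = mann_whitney_u(control, cs_tbi)
```

For small samples or heavily tied categorical predictors, an exact or tie-corrected implementation from a statistics package would be preferable to this approximation.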

Machine learning
Our primary analysis sought to assess the predictive accuracy of a local, big-data-driven machine learning approach based on the previously published clinical decision rules and traditional analytic techniques for classification. A decision tree was selected as the machine learning model. This study used Python version 3.7 and accompanying packages such as Scikit-Learn. To predict csTBI from the predictors, we applied supervised ML (sML) using a program written in Python, with the decision tree method used to classify the children. The performance of the algorithm was assessed by calculating accuracy, precision, and F1 score. The data for this study were divided into two sets: a training data set and a test data set. The training data set accounted for 80% of the total data. The performance of the predictive models was also evaluated using Receiver Operating Characteristic (ROC) curves, specifically the Area Under the Curve (AUC). To investigate the risk of mTBI (csTBI) at specific days of life, the outcome of mTBI (csTBI) was plotted against days of life.
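The workflow described above (an 80/20 split, a depth-limited decision tree, and AUC on the held-out set) can be sketched with Scikit-Learn as follows. This is an illustration under stated assumptions, not the authors' program: the data are synthetic, the three feature columns are hypothetical stand-ins for PECARN predictors, and the use of stratification and a fixed random seed is an assumption, since the source does not describe how the split was made.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n_children = 1100

# Synthetic stand-ins for three predictors (assumed names, not study data).
days_of_life = rng.integers(0, 730, size=n_children)
palpable_fracture = rng.integers(0, 2, size=n_children)
scalp_hematoma = rng.integers(0, 3, size=n_children)
X = np.column_stack([days_of_life, palpable_fracture, scalp_hematoma])

# Synthetic, imbalanced outcome loosely tied to the predictors (assumption).
risk = 0.02 + 0.10 * palpable_fracture + 0.05 * (days_of_life < 90)
y = (rng.random(n_children) < risk).astype(int)

# 80% training, 20% test; stratifying keeps the rare class in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Fit trees of several maximum depths and record train/test performance.
results = {}
for depth in (2, 3, 10):
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    test_scores = tree.predict_proba(X_test)[:, 1]
    results[depth] = {
        "train_acc": tree.score(X_train, y_train),
        "test_acc": tree.score(X_test, y_test),
        "test_auc": roc_auc_score(y_test, test_scores),
    }

for depth, metrics in results.items():
    print(depth, metrics)
```

Reporting training and test accuracy side by side for each depth makes overfitting visible: a deep tree typically scores higher on the training set without a matching gain on the test set.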
This study complies with the standards of the Declaration of Helsinki and the current ethical guidelines. The study was also approved by the institutional ethics board and by the IRB. Verbal consent was obtained from the caregivers for use of the data.

Results
Table 2 showed the demographic characteristics of the control, mTBI on CT, ciTBI, and csTBI groups. The female ratio and days of life in the control group were significantly higher than in the mTBI on CT, ciTBI, and csTBI groups (Tables 3, 4). CT was obtained in 26.0% of all children: 21.9% of the control group and 100% of both the mTBI on CT and ciTBI groups.

Group comparison
In the two-group comparison between control and csTBI, statistical significance was observed for all non-parametric predictors except for severity of injury mechanism (Table 3). Table 4 also showed the results in three-group comparisons for all parametric and non-parametric predictors. Based on the results, these predictors were divided into four classes (Table 5).

Prediction with machine learning
Supervised ML with a decision tree was applied to classify the children into two classes: control children who did not need a CT scan and children with csTBI who needed a CT scan. Fig 1a showed the relationship between the maximum depth (max depth) of the tree and the area under the curve (AUC), revealing that the test data showed a peak AUC at the third depth, followed by a decreasing AUC. Therefore, we created an ML algorithm with this constraint and achieved an accuracy of 0.95 (Fig 1b). Fig 1c shows the relationship between the false positive rate (fpr) and the true positive rate (tpr) for max depths of 2, 3, and 10. In the setting of max depth 3, the accuracy of the training and test data was 0.961 and 0.955, respectively (Table 6). A comparison of the actual and predicted data showed that accuracy, precision, and F1 scores were 0.95, 0.95, and 0.95, respectively. The AUC was 0.85 at max depth 3.
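For reference, accuracy, precision, recall, and F1 score can all be derived from the four cells of a two-class confusion matrix. The sketch below uses a small hypothetical set of labels, not the study's predictions, and the function name `classification_metrics` is illustrative. It computes the metrics for the positive class only; the source does not state whether its reported precision and F1 are per-class or averaged across classes.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical example: 8 controls (0) and 2 csTBI cases (1).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
# Here TP = 1, FP = 1, FN = 1, TN = 7.
```

In this toy case accuracy is 0.8 while precision, recall, and F1 for the positive class are each 0.5, which shows why accuracy alone can look favorable when the positive class is rare.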

Discussion
This study addressed two issues regarding the need for CT scans in children with minor head trauma. First, the statistical evaluation of the predictors presented in the PECARN study [26] showed significant differences between control and csTBI, mTBI on CT, or ciTBI, respectively. Secondly, the study showed that sML could be used to predict the necessity of a CT scan of the head with high accuracy for children with mTBI. This study also elucidated the importance of each predictor, especially days of life.
Table 2 showed the demographic characteristics of children in the control, mTBI on CT, ciTBI, and csTBI groups. In the two-group comparison between control and csTBI, there was a statistical difference in days of life, although gender showed no difference with p = 0.05 (Table 3). In the three-group comparison, the control group had significantly more days of life than mTBI on CT and ciTBI (Table 4), while there was no difference in days of life between mTBI on CT and ciTBI, or in gender. The CT acquisition rate in this study was 26% of all children, which is lower than the 35% reported in the PECARN study [26]. Meanwhile, the CT acquisition rate was 100% both in children with mTBI on CT and in children with ciTBI. These findings were better than expected [27,34-36].

The comparison regarding the non-parametric predictors
Comparison of the non-parametric predictors between the two groups showed that all predictors except severity of injury mechanism differed significantly between control and csTBI (Table 3). This confirms that most of the predictors in the PECARN study are important for identifying children with csTBI. Meanwhile, the non-parametric predictors could be subdivided into four classes according to how they discriminated between the three groups of children: control, mTBI on CT, and ciTBI (Tables 4, 5). Gender and severity of injury mechanism were classified as class I, both of which showed no significance in comparisons between any two of the three groups (Table 5). Class II included days of life, history of vomiting, frequency of vomiting, and palpable skull fractures, which distinguished children with mTBI on CT and children with ciTBI from control children. Conversely, the class II predictors could not discriminate between children with mTBI on CT and those with ciTBI. In addition, history of LOC, LOC duration, and scalp hematoma were classified as class III and showed significance between control and ciTBI and between mTBI on CT and ciTBI, but not between control and mTBI on CT. Taken together, the class II predictors could identify children with csTBI but could hardly indicate the severity of the head injury, whereas class III predictors may be used to identify more severe types of traumatic brain injury. All of the class IV predictors, which relate to consciousness, were significant in all of the two-group comparisons among the three groups. In other words, the results suggested that predictors related to consciousness are important when considering the need for CT scans in children with head trauma. The PECARN study showed that six predictors were important: altered mental status, scalp hematoma, LOC, mechanism of injury, palpable skull fracture, and acting normally per parent.
In particular, altered mental status and palpable skull fractures were associated with a higher risk of ciTBI. The suggested CT algorithm for children younger than 2 years indicated that GCS 14 or altered mental status, and palpable skull fracture, were the first predictors for picking up the children who require a CT scan [26]. They were classified as II and III in this study, suggesting that these results were compatible with those of the PECARN study. In the second branch of the PECARN algorithm, scalp hematomas other than frontal, a history of LOC longer than 5 seconds, severe injury mechanism, and acting abnormally per parent were the predictors used to single out children for whom CT was not recommended. These predictors were classified as class III and IV, except for severe injury mechanism. This suggested that children with minor head trauma requiring CT scans may be picked up by a combination of class II and IV or class III and IV predictors [37-39]. To the best of our knowledge, this is the first indication that each predictor fulfills its own role.
The injury mechanism has been previously identified as an independent predictor of TBI [24,26,27,34,40]. Mechanisms associated with increased risk of TBI in children after blunt injury include high-speed motor vehicle accidents, bicycle-related injury, impact from a high-speed projectile, and falls from a height or down stairs [27,34,41]. Nigrovic et al. concluded that children with an isolated severe injury mechanism are at low risk of ciTBI and that many do not require emergent neuroimaging [42].

Prediction of the necessity of a CT of the head with sML
With sML using a decision tree method, the children with csTBI could be successfully identified from the control with a prediction accuracy score of 95% (Fig 1b). Fig 1d illustrated the importance of the predictors when creating the decision tree, revealing that days of life was the most important, followed by palpable skull fracture and scalp hematoma. On the other hand, GCS and signs of basilar skull fracture showed less importance in this decision tree. This study applied sML with the decision tree method because decision trees are powerful and popular prediction methods. The final decision tree is very well suited for operational use because it can explain precisely why a particular prediction was made. Decision tree algorithms are, however, known to overfit the training set. It is therefore critical to provide information on performance on the training and test sets separately, as well as information on the parameter tuning of the algorithm, such as grid search [43]. The prediction accuracy and AUC were maximized at a maximum depth of 3 when creating the sML algorithm for two-class classification in this study (Fig 1a and 1c). Under these conditions, the training and test sets achieved high accuracies of 96.1% and 95.5%, respectively. Accuracy, precision, and F1 score were 0.95, 0.95, and 0.95, respectively, which also indicated the effectiveness of the algorithm.
We also attempted to use sML to identify children with mTBI on CT or ciTBI from control. Fig 2a showed that a decision tree could be created with sML, with a prediction accuracy score of 95% when applying a max depth of 7. The ROC AUC for mTBI on CT was 0.85 as shown in Fig 2c, while those for control and ciTBI were moderately high. In the analysis of the contribution of each predictor to the decision tree, days of life was the most significant for identifying the children of each classification (Figs 1d and 2c).
Furthermore, days of life appeared with different cutoff values in many branches (Figs 1b and 2a). These findings suggest that days of life may be the most important factor in deciding whether to obtain CT scans for head trauma in children younger than 2 years of age, and that days of life could be used instead of age in years in general clinical decision rules. Days of life was employed in this study because we believe that fine-grained age carries important information, especially when small children are the subject of clinical research. For example, a child who is 364 days old is counted as 0 years old, and a child who is 365 days old is counted as 1 year old, but it is natural to assume that there is no significant difference between them in terms of development and growth. In addition, Figs 1d and 2c revealed the importance of predictors such as scalp hematoma, palpable skull fracture, and altered mental status. These predictors were also key factors for identifying the children requiring a CT scan in the PECARN algorithm. In the PECARN study, the prediction rule with normal mental status, no scalp hematoma except frontal, no LOC or LOC for less than 5 seconds, non-severe injury mechanism, no palpable skull fracture, and acting normally per caregivers had a negative predictive value of 100% and sensitivity of 100% [26]. To the best of our knowledge, this is the first analysis to show the feasibility of sML to identify children with csTBI from control, and the significance of each predictor, especially days of life. However, Fig 3 did not show a characteristic relationship between the risk of mTBI and days of life.
This study indicated sML could be used to predict the necessity of a head CT regarding childhood mTBI. Although AI-based systems are powerful technologies [44-51], they should not replace the clinical judgment of physicians and medical teams [29][30][31][32][33]. The ideal role of these systems is as a data-driven input to the surgical decision-making process, designed to solve focused problems such as predicting the risk of mTBI in this study.

Limitation of this study
Regarding demographic characteristics, statistical differences were found between the control group and children with csTBI in the two- and three-group comparisons, particularly concerning days of life. This issue may affect the interpretation of the results of this study. Future studies may need better demographic controls. CT scans were not performed on all children because we could not ethically justify exposing children to radiation. As with other decision support tools, these methods provide information to physicians and do not replace their decision-making [52]. In this study, the decision tree method was applied to create the sML algorithm; further studies using Random Forest, CatBoost, LightGBM, etc. may be required for more precise analysis [53,54]. Also, only the parameters identified in the PECARN study were included in this study; other parameters should be included as features to obtain further gains in performance.
Since the purpose of this study is to determine the feasibility of sML for the problem of CT scans of children with minor head trauma, strict scientific procedures such as under-sampling and bagging were not applied to resolve class imbalances. This issue should be resolved in future studies.
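To make the class-imbalance issue concrete, the sketch below uses the group sizes reported in this study (1042 controls and 58 children with csTBI) to compute the accuracy of a trivial rule that labels every child as control, and then shows one simple form of random under-sampling. The function name `undersample` and the fixed seed are illustrative assumptions; this is a minimal sketch of the general technique, not a procedure the authors applied.

```python
import random


def undersample(labels, seed=0):
    """Return sorted indices of a class-balanced subset.

    Every class is randomly reduced to the size of the smallest class.
    """
    rng = random.Random(seed)
    indices_by_class = {}
    for index, label in enumerate(labels):
        indices_by_class.setdefault(label, []).append(index)

    n_minority = min(len(idx) for idx in indices_by_class.values())
    kept = []
    for idx in indices_by_class.values():
        kept.extend(rng.sample(idx, n_minority))
    return sorted(kept)


# Group sizes from this study: 0 = control, 1 = csTBI (ciTBI + mTBI on CT).
labels = [0] * 1042 + [1] * 58

# Accuracy of always predicting "control", with no model at all.
baseline_accuracy = labels.count(0) / len(labels)

# Balanced subset: 58 controls and 58 csTBI cases.
balanced_indices = undersample(labels)
```

With these group sizes the all-control baseline already reaches roughly 0.947 accuracy, so sensitivity for csTBI and AUC are more informative than accuracy when judging a classifier on data like these. Under-sampling should be applied to the training set only, leaving the test set at its natural class ratio.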

Conclusion
This study clarified two issues regarding the need for CT scans in children with minor head trauma. First, the evaluation of the PECARN predictors showed significant differences between control and csTBI, mTBI on CT, or ciTBI, respectively. Secondly, the study showed that ML could be used to predict the necessity of a head CT with high accuracy for children with mTBI, and also elucidated the importance of each predictor, especially days of life. These results are substantial for ER physicians because they need to balance radiation exposure against the need not to miss serious head trauma when deciding whether a child with minor head trauma needs a CT scan.